Cosine distribution (continuous)#

A compact-support, symmetric distribution on an interval of length \(2\pi\) with density proportional to \(1+\cos(x)\). It’s a simple, smooth alternative to a uniform distribution when you want more mass in the middle and zero density at the endpoints.

In this notebook you’ll:

  • Define the PDF/CDF (including location/scale form)

  • Derive key moments (mean/variance) and the likelihood

  • Implement NumPy-only sampling (accept–reject)

  • Visualize PDF, CDF, and Monte Carlo samples

  • Use scipy.stats.cosine for practical work

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy import stats

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)

rng = np.random.default_rng(42)

1) Title & Classification#

  • Name: cosine distribution

  • Type: continuous

  • Standard support: \(x \in [-\pi,\,\pi]\)

  • Standard parameters: none (fixed shape)

SciPy exposes it as a location–scale family:

  • Parameters: loc = \(\mu \in \mathbb{R}\), scale = \(\sigma > 0\)

  • Support: \(x \in [\mu-\pi\sigma,\,\mu+\pi\sigma]\)

2) Intuition & Motivation#

What it models#

The cosine distribution is a bounded, symmetric, unimodal distribution with density that smoothly decreases from the center to the edges.

Think of it as a “bumped” uniform distribution on \([-\pi,\pi]\):

  • Uniform: constant density on the interval.

  • Cosine: density \(\propto 1+\cos(x)\), so it peaks at \(x=0\) and hits 0 at \(\pm\pi\).

Typical use cases#

  • A simple model or prior for a quantity constrained to an interval (especially an angle difference mapped to \([-\pi,\pi]\)) with central tendency.

  • A pedagogical example of a finite-support distribution with closed-form CDF and tractable moments.

  • Lightweight generative modeling on bounded domains (possibly as a mixture component).

Relations to other distributions#

  • Location–scale family of the standard cosine distribution.

  • Closely related to the raised cosine shape (common in signal processing).

  • Via a transformation, it connects to the Wigner semicircle distribution: if \(U\) has semicircle density on \([-1,1]\), then \(X = 2\arcsin(U)\) follows the cosine distribution.
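The semicircle connection is easy to spot-check numerically. The sketch below uses one convenient way to sample the semicircle (if \(B \sim \mathrm{Beta}(3/2, 3/2)\), then \(U = 2B - 1\) has the semicircle density on \([-1,1]\)) and verifies that \(2\arcsin(U)\) matches scipy.stats.cosine via a KS test:

```python
import numpy as np
from scipy import stats

rng_sc = np.random.default_rng(0)

# Sample the Wigner semicircle on [-1, 1]: if B ~ Beta(3/2, 3/2) on [0, 1],
# then U = 2B - 1 has density (2/pi) * sqrt(1 - u^2).
u = 2.0 * rng_sc.beta(1.5, 1.5, size=100_000) - 1.0

# Transform: X = 2 * arcsin(U) should follow the standard cosine distribution.
x_sc = 2.0 * np.arcsin(u)

# Compare against scipy.stats.cosine with a KS test; a small statistic
# (and a non-tiny p-value) supports the claimed relation.
D, p = stats.kstest(x_sc, "cosine")
print("KS statistic:", D, "p-value:", p)
```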

3) Formal Definition#

PDF#

Standard form:

\[ f(x) = \frac{1+\cos(x)}{2\pi}, \qquad x\in[-\pi,\pi], \]

and \(f(x)=0\) otherwise.

Location–scale form (SciPy’s parameterization):

\[ f(x\mid \mu,\sigma) = \frac{1+\cos\left(\frac{x-\mu}{\sigma}\right)}{2\pi\sigma}, \qquad x\in[\mu-\pi\sigma,\,\mu+\pi\sigma],\ \sigma>0. \]

CDF#

For the standard form:

\[ F(x) = \begin{cases} 0, & x < -\pi,\\ \frac{x+\sin(x)+\pi}{2\pi}, & -\pi \le x \le \pi,\\ 1, & x > \pi. \end{cases} \]

For \((\mu,\sigma)\): \(F(x\mid\mu,\sigma) = F_0\left(\frac{x-\mu}{\sigma}\right)\) where \(F_0\) is the standard CDF above.

PI = np.pi


def cosine_support(loc: float = 0.0, scale: float = 1.0) -> tuple[float, float]:
    if scale <= 0:
        raise ValueError("scale must be > 0")
    return loc - PI * scale, loc + PI * scale


def cosine_pdf(x, loc: float = 0.0, scale: float = 1.0):
    x = np.asarray(x)
    if scale <= 0:
        raise ValueError("scale must be > 0")
    z = (x - loc) / scale
    inside = (-PI <= z) & (z <= PI)
    out = np.zeros_like(z, dtype=float)
    out[inside] = (1.0 + np.cos(z[inside])) / (2.0 * PI * scale)
    return out


def cosine_cdf(x, loc: float = 0.0, scale: float = 1.0):
    x = np.asarray(x)
    if scale <= 0:
        raise ValueError("scale must be > 0")
    z = (x - loc) / scale
    out = np.empty_like(z, dtype=float)
    out[z < -PI] = 0.0
    out[z > PI] = 1.0
    inside = (-PI <= z) & (z <= PI)
    out[inside] = (z[inside] + np.sin(z[inside]) + PI) / (2.0 * PI)
    return out


def cosine_logpdf(x, loc: float = 0.0, scale: float = 1.0):
    x = np.asarray(x)
    if scale <= 0:
        raise ValueError("scale must be > 0")
    z = (x - loc) / scale
    inside = (-PI <= z) & (z <= PI)

    cosz = np.cos(z)
    cosz = np.clip(cosz, -1.0, 1.0)

    out = np.full_like(z, -np.inf, dtype=float)
    with np.errstate(divide="ignore"):
        out[inside] = np.log1p(cosz[inside]) - np.log(2.0 * PI * scale)
    return out


# Quick sanity checks
x_grid = np.linspace(-PI, PI, 20001)
pdf_grid = cosine_pdf(x_grid)

mass_trapz = np.trapz(pdf_grid, x_grid)
print("Numerical integral of PDF over support:", mass_trapz)

# Match SciPy
max_abs_pdf_err = np.max(np.abs(pdf_grid - stats.cosine.pdf(x_grid)))
max_abs_cdf_err = np.max(np.abs(cosine_cdf(x_grid) - stats.cosine.cdf(x_grid)))
print("max |pdf - scipy.pdf|:", max_abs_pdf_err)
print("max |cdf - scipy.cdf|:", max_abs_cdf_err)
Numerical integral of PDF over support: 1.0
max |pdf - scipy.pdf|: 5.551115123125783e-17
max |cdf - scipy.cdf|: 1.1102230246251565e-16

4) Moments & Properties#

Let \(X\) denote the standard cosine distribution on \([-\pi,\pi]\).

Mean, variance, skewness, kurtosis#

  • By symmetry, \(\mathbb{E}[X]=0\).

  • The variance is

\[ \operatorname{Var}(X) = \frac{\pi^2}{3} - 2. \]
  • Skewness is 0 (symmetric distribution).

  • The fourth moment is

\[ \mathbb{E}[X^4] = \frac{\pi^4}{5} - 4\pi^2 + 24, \]

so the (excess) kurtosis is

\[ \gamma_2 = \frac{\mathbb{E}[X^4]}{\operatorname{Var}(X)^2} - 3. \]

For the location–scale version \(Y = \mu + \sigma X\):

  • \(\mathbb{E}[Y]=\mu\)

  • \(\operatorname{Var}(Y)=\sigma^2\left(\frac{\pi^2}{3}-2\right)\)

  • Skewness and kurtosis are unchanged by location–scale.

MGF / characteristic function#

The MGF exists for all real \(t\) (bounded support):

\[ M_X(t) = \mathbb{E}[e^{tX}] = \frac{\sinh(\pi t)}{\pi t\,(1+t^2)}. \]

The characteristic function is

\[ \varphi_X(\omega) = \mathbb{E}[e^{i\omega X}] = \frac{\sin(\pi\omega)}{\pi\omega\,(1-\omega^2)}, \]

with removable singularities at \(\omega\in\{0,\pm 1\}\).
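The characteristic function formula is easy to verify numerically. This sketch compares the closed form against direct quadrature of \(\mathbb{E}[\cos(\omega X)]\) (the imaginary part vanishes by symmetry), avoiding the removable singularities by evaluating away from \(\omega \in \{0, \pm 1\}\):

```python
import numpy as np
from scipy import integrate

PI = np.pi


def cf_closed_form(w):
    # sin(pi*w) / (pi*w*(1 - w^2)); valid away from the removable
    # singularities at w in {0, +1, -1}.
    return np.sin(PI * w) / (PI * w * (1.0 - w**2))


def cf_numeric(w):
    # E[cos(wX)] by quadrature against the standard cosine density.
    val, _ = integrate.quad(
        lambda x: np.cos(w * x) * (1.0 + np.cos(x)) / (2.0 * PI), -PI, PI
    )
    return val


for w in [0.3, 0.7, 2.5]:
    print(w, cf_closed_form(w), cf_numeric(w))
```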

Entropy#

The differential entropy (in nats) for the standard form is

\[ h(X) = \ln(4\pi) - 1. \]

For \(Y = \mu + \sigma X\), \(h(Y)=h(X)+\ln\sigma\).

# Closed-form moment constants (standard form)
var0 = PI**2 / 3.0 - 2.0
m4 = PI**4 / 5.0 - 4.0 * PI**2 + 24.0
kurtosis_excess0 = m4 / (var0**2) - 3.0
entropy0 = np.log(4.0 * PI) - 1.0

print("Var[X] =", var0)
print("Excess kurtosis =", kurtosis_excess0)
print("Entropy (nats) =", entropy0)

# Monte Carlo verification
n = 400_000
x_mc = stats.cosine.rvs(size=n, random_state=rng)

mean_mc = x_mc.mean()
var_mc = x_mc.var()

m4_mc = np.mean((x_mc - mean_mc) ** 4)
kurtosis_excess_mc = m4_mc / (var_mc**2) - 3.0

print()
print("Monte Carlo (n=%d)" % n)
print("mean ~", mean_mc)
print("var  ~", var_mc)
print("excess kurtosis ~", kurtosis_excess_mc)
Var[X] = 1.2898681336964528
Excess kurtosis = -0.5937628755982818
Entropy (nats) = 1.5310242469692907

Monte Carlo (n=400000)
mean ~ -0.0005897480997648315
var  ~ 1.2884274555095643
excess kurtosis ~ -0.5946739650596489

5) Parameter Interpretation#

SciPy’s cosine is the standard distribution with location and scale:

  • loc = \(\mu\): shifts the center (mean/mode/median) to \(\mu\).

  • scale = \(\sigma\): stretches the support from \([-\pi,\pi]\) to \([\mu-\pi\sigma,\mu+\pi\sigma]\) and rescales the height by \(1/\sigma\).

There are no additional shape parameters: the overall “cosine bump” shape is fixed.

# Visualize the effect of loc/scale
params = [
    dict(loc=0.0, scale=1.0, name="loc=0, scale=1"),
    dict(loc=0.0, scale=0.5, name="loc=0, scale=0.5"),
    dict(loc=0.0, scale=2.0, name="loc=0, scale=2"),
    dict(loc=1.0, scale=1.0, name="loc=1, scale=1"),
]

fig = go.Figure()
for p in params:
    lo, hi = cosine_support(p["loc"], p["scale"])
    x = np.linspace(lo, hi, 2000)
    y = cosine_pdf(x, p["loc"], p["scale"])
    fig.add_trace(go.Scatter(x=x, y=y, mode="lines", name=p["name"]))

fig.update_layout(
    title="Cosine PDF for different (loc, scale)",
    xaxis_title="x",
    yaxis_title="density",
)
fig.show()

6) Derivations#

Expectation (standard form)#

Because \(f(x)\) is even (symmetric about 0), \(x f(x)\) is odd, so

\[ \mathbb{E}[X] = \int_{-\pi}^{\pi} x\,\frac{1+\cos x}{2\pi}\,dx = 0. \]

Variance (standard form)#

Since \(\mathbb{E}[X]=0\), \(\operatorname{Var}(X)=\mathbb{E}[X^2]\):

\[ \mathbb{E}[X^2] = \int_{-\pi}^{\pi} x^2\,\frac{1+\cos x}{2\pi}\,dx = \frac{1}{\pi}\int_{0}^{\pi} x^2(1+\cos x)\,dx. \]

Compute the two terms:

  • \(\int_0^\pi x^2 dx = \pi^3/3\).

  • By parts: \(\int_0^\pi x^2\cos x\,dx = -2\pi\).

Therefore

\[ \operatorname{Var}(X)=\frac{1}{\pi}\left(\frac{\pi^3}{3} - 2\pi\right)=\frac{\pi^2}{3}-2. \]

Likelihood (location–scale)#

For i.i.d. data \(x_1,\dots,x_n\) under \((\mu,\sigma)\):

\[ \ell(\mu,\sigma) = \sum_{i=1}^n \log f(x_i\mid\mu,\sigma) = -n\log(2\pi\sigma) + \sum_{i=1}^n \log\left(1+\cos\left(\frac{x_i-\mu}{\sigma}\right)\right) \]

subject to the support constraint \(x_i\in[\mu-\pi\sigma,\mu+\pi\sigma]\) for all \(i\).

There is no simple closed-form MLE; numerical optimization (as in SciPy’s fit) is typical.

def cosine_loglike(data, loc: float, scale: float) -> float:
    return float(np.sum(cosine_logpdf(np.asarray(data), loc=loc, scale=scale)))


# Example: simulate data and evaluate the log-likelihood surface around the truth
true_loc, true_scale = 0.5, 1.2
x = stats.cosine.rvs(loc=true_loc, scale=true_scale, size=500, random_state=rng)

loc_grid = np.linspace(true_loc - 0.6, true_loc + 0.6, 60)
scale_grid = np.linspace(0.6, 2.0, 70)

ll = np.empty((len(scale_grid), len(loc_grid)))
for i, s in enumerate(scale_grid):
    for j, m in enumerate(loc_grid):
        ll[i, j] = cosine_loglike(x, loc=m, scale=s)

# Plot as a heatmap
fig = go.Figure(
    data=go.Heatmap(
        z=ll,
        x=loc_grid,
        y=scale_grid,
        colorscale="Viridis",
        colorbar=dict(title="log-likelihood"),
    )
)
fig.add_trace(
    go.Scatter(
        x=[true_loc],
        y=[true_scale],
        mode="markers",
        marker=dict(color="red", size=10),
        name="true (loc, scale)",
    )
)
fig.update_layout(title="Log-likelihood surface (example)", xaxis_title="loc", yaxis_title="scale")
fig.show()
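Going one step beyond evaluating the surface, here is a minimal numerical-MLE sketch (assuming scipy.optimize; it parameterizes \(\sigma\) on the log scale and returns \(+\infty\) whenever any observation falls outside the support, so Nelder–Mead stays in the feasible region):

```python
import numpy as np
from scipy import optimize, stats

PI = np.pi
rng_mle = np.random.default_rng(7)
x_sim = stats.cosine.rvs(loc=0.5, scale=1.2, size=2000, random_state=rng_mle)


def neg_loglike(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)            # log-parameterization keeps sigma > 0
    z = (x_sim - mu) / sigma
    if np.any(np.abs(z) >= PI):          # outside the support: zero likelihood
        return np.inf
    return -(np.sum(np.log1p(np.cos(z))) - x_sim.size * np.log(2.0 * PI * sigma))


res = optimize.minimize(
    neg_loglike,
    x0=[x_sim.mean(), np.log(x_sim.std())],  # simple feasible starting point
    method="Nelder-Mead",
)
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print("MLE (loc, scale) ~", (mu_hat, sigma_hat))
```

Nelder–Mead is used because the infinite penalty makes the objective non-smooth at the support boundary; gradient-based methods would need the constraint handled explicitly.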

7) Sampling & Simulation (NumPy-only)#

Accept–reject sampling#

We can sample from the standard cosine distribution using a uniform proposal:

  • Proposal: \(Z \sim \mathrm{Unif}[-\pi,\pi]\) with density \(g(z)=1/(2\pi)\).

  • Target: \(f(z)=(1+\cos z)/(2\pi)\).

The ratio is

\[ \frac{f(z)}{g(z)} = 1+\cos z \in [0,2], \]

so we can take \(M=2\) and accept with probability

\[ \alpha(z) = \frac{f(z)}{Mg(z)} = \frac{1+\cos z}{2}. \]

This yields an average acceptance rate of \(1/M = 50\%\).

To sample the location–scale version, return \(X = \mu + \sigma Z\).

def cosine_rvs_numpy(
    n: int,
    *,
    loc: float = 0.0,
    scale: float = 1.0,
    rng: np.random.Generator | None = None,
) -> tuple[np.ndarray, float]:
    """Sample from the cosine distribution using NumPy-only accept-reject.

    Returns (samples, acceptance_rate).
    """

    if n < 0:
        raise ValueError("n must be >= 0")
    if scale <= 0:
        raise ValueError("scale must be > 0")
    if rng is None:
        rng = np.random.default_rng()

    out = np.empty(n, dtype=float)
    filled = 0

    proposals = 0
    accepted_total = 0

    # Expected acceptance rate is exactly 0.5 for the Uniform(-pi, pi) proposal.
    while filled < n:
        remaining = n - filled
        batch = int(max(256, 2.2 * remaining))

        z = rng.uniform(-PI, PI, size=batch)
        u = rng.random(size=batch)
        accept = u <= (1.0 + np.cos(z)) / 2.0

        accepted = z[accept]

        proposals += batch
        accepted_total += len(accepted)

        k = min(len(accepted), remaining)
        out[filled : filled + k] = accepted[:k]
        filled += k

    # Apply location-scale transform
    out = loc + scale * out

    acceptance_rate = accepted_total / proposals
    return out, acceptance_rate


samples, acc_rate = cosine_rvs_numpy(200_000, rng=rng)
print("acceptance rate ~", acc_rate)
print("sample mean ~", samples.mean())
acceptance rate ~ 0.5006909090909091
sample mean ~ -0.0017187970087170262
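An alternative to accept–reject is inverse-transform sampling: solve \(F(x) = u\) for \(x\) with \(u \sim \mathrm{Unif}(0,1)\). Newton's method stalls near the endpoints, where \(F'(x) = (1+\cos x)/(2\pi) \to 0\), so this sketch uses a vectorized bisection instead, which is simple and robust (at the cost of a fixed number of iterations per draw):

```python
import numpy as np

PI = np.pi


def cosine_rvs_inverse(n, rng=None, iters=60):
    """Inverse-transform sampling for the standard cosine distribution.

    Solves F(x) = (x + sin x + pi) / (2 pi) = u by vectorized bisection;
    F is strictly increasing on (-pi, pi), so the root is unique.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.random(n)
    lo = np.full(n, -PI)
    hi = np.full(n, PI)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        too_low = (mid + np.sin(mid) + PI) / (2.0 * PI) < u
        lo = np.where(too_low, mid, lo)    # root lies above mid
        hi = np.where(too_low, hi, mid)    # root lies at or below mid
    return 0.5 * (lo + hi)


rng_inv = np.random.default_rng(3)
s = cosine_rvs_inverse(100_000, rng=rng_inv)
print("mean ~", s.mean(), "var ~", s.var())
```

Unlike accept–reject, this uses exactly one uniform draw per sample, which can matter when reproducibility of the random stream is important.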

8) Visualization#

We’ll plot:

  • Theoretical PDF and CDF

  • Histogram of Monte Carlo samples with the theoretical PDF overlay

  • Empirical CDF vs theoretical CDF

# PDF and CDF (standard)
x = np.linspace(-PI, PI, 2000)

fig_pdf = go.Figure()
fig_pdf.add_trace(go.Scatter(x=x, y=cosine_pdf(x), mode="lines", name="PDF"))
fig_pdf.update_layout(title="Cosine PDF (standard)", xaxis_title="x", yaxis_title="density")
fig_pdf.show()

fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=x, y=cosine_cdf(x), mode="lines", name="CDF"))
fig_cdf.update_layout(title="Cosine CDF (standard)", xaxis_title="x", yaxis_title="probability")
fig_cdf.show()

# Histogram + overlay
fig_hist = px.histogram(samples, nbins=80, histnorm="probability density", title="Monte Carlo samples")
fig_hist.add_trace(go.Scatter(x=x, y=cosine_pdf(x), mode="lines", name="theoretical PDF"))
fig_hist.update_layout(xaxis_title="x", yaxis_title="density")
fig_hist.show()

# Empirical CDF vs theoretical CDF
xs = np.sort(samples)
ys = np.arange(1, len(xs) + 1) / len(xs)

fig_ecdf = go.Figure()
fig_ecdf.add_trace(go.Scatter(x=xs, y=ys, mode="lines", name="empirical CDF"))
fig_ecdf.add_trace(go.Scatter(x=x, y=cosine_cdf(x), mode="lines", name="theoretical CDF"))
fig_ecdf.update_layout(title="Empirical vs theoretical CDF", xaxis_title="x", yaxis_title="probability")
fig_ecdf.show()

9) SciPy Integration (scipy.stats.cosine)#

SciPy provides a ready-to-use implementation with the usual methods:

  • .pdf(x, loc, scale) / .logpdf(...)

  • .cdf(x, loc, scale)

  • .rvs(size, loc, scale, random_state=...)

  • .fit(data) (estimates loc and scale)

Below is a small end-to-end example.

dist = stats.cosine

# Generate synthetic data
true_loc, true_scale = -0.3, 0.8
x = dist.rvs(loc=true_loc, scale=true_scale, size=2000, random_state=rng)

# Fit loc/scale
loc_hat, scale_hat = dist.fit(x)
print("true loc, scale:", (true_loc, true_scale))
print("fit  loc, scale:", (loc_hat, scale_hat))

# Compare fitted PDF to histogram
lo, hi = cosine_support(loc_hat, scale_hat)
x_plot = np.linspace(lo, hi, 2000)

fig = px.histogram(x, nbins=60, histnorm="probability density", title="Fit with scipy.stats.cosine")
fig.add_trace(go.Scatter(x=x_plot, y=dist.pdf(x_plot, loc=loc_hat, scale=scale_hat), mode="lines", name="fitted PDF"))
fig.add_trace(go.Scatter(x=x_plot, y=dist.pdf(x_plot, loc=true_loc, scale=true_scale), mode="lines", name="true PDF"))
fig.update_layout(xaxis_title="x", yaxis_title="density")
fig.show()
true loc, scale: (-0.3, 0.8)
fit  loc, scale: (-0.2959937908105679, 0.7922314131197986)

10) Statistical Use Cases#

Hypothesis testing#

  • Goodness-of-fit: test whether data are consistent with a cosine model (often after fitting loc/scale).

  • Model comparison: compare cosine vs uniform or other bounded distributions using likelihood-based criteria.

Below are two small examples:

  1. A KS test when parameters are known.

  2. A simple moment-based test that distinguishes uniform vs cosine on \([-\pi,\pi]\) using \(\mathbb{E}[\cos X]\).

Bayesian modeling#

Use the cosine distribution as a bounded, smooth prior/likelihood component when you want:

  • support on a finite interval

  • a single smooth mode

  • zero density at endpoints

Generative modeling#

As a base distribution for bounded variables (possibly with mixtures):

\[ X \sim \sum_{k=1}^K w_k\,\mathrm{Cosine}(\mu_k,\sigma_k). \]

This can approximate a variety of unimodal/multimodal shapes on bounded domains.
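A minimal sketch of such a mixture (the weights, locations, and scales below are hypothetical, chosen only for illustration):

```python
import numpy as np
from scipy import stats

rng_mix = np.random.default_rng(11)

# Hypothetical two-component mixture parameters.
w = np.array([0.4, 0.6])        # mixture weights (sum to 1)
mu = np.array([-1.5, 1.0])      # component locations
sigma = np.array([0.5, 0.8])    # component scales


def mixture_pdf(x):
    # Weighted sum of component densities.
    return sum(
        wk * stats.cosine.pdf(x, loc=mk, scale=sk)
        for wk, mk, sk in zip(w, mu, sigma)
    )


def mixture_rvs(n):
    # Pick a component index per draw, then sample from that component.
    k = rng_mix.choice(len(w), size=n, p=w)
    return stats.cosine.rvs(loc=mu[k], scale=sigma[k], random_state=rng_mix)


samples_mix = mixture_rvs(50_000)
print("sample mean ~", samples_mix.mean(), "theoretical:", np.dot(w, mu))
print("mixture pdf at 0:", mixture_pdf(0.0))
```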

# 1) KS test with known parameters (valid when parameters are fixed a priori)
loc0, scale0 = 0.2, 1.1
x0 = stats.cosine.rvs(loc=loc0, scale=scale0, size=800, random_state=rng)

# Test against the exact CDF
D, p = stats.kstest(x0, "cosine", args=(loc0, scale0))
print("KS statistic:", D)
print("p-value     :", p)

# 2) Uniform vs cosine on [-pi, pi] via E[cos X]
# Under Uniform(-pi,pi): E[cos X]=0 and Var(cos X)=1/2
# Under Cosine: E[cos X]=1/2

x_u = rng.uniform(-PI, PI, size=2000)

def z_test_uniform_vs_cosine(x):
    m = np.mean(np.cos(x))
    se = np.sqrt(0.5 / len(x))
    return m / se

z_u = z_test_uniform_vs_cosine(x_u)
z_c = z_test_uniform_vs_cosine(stats.cosine.rvs(size=2000, random_state=rng))

print()
print("Z under uniform (should be ~N(0,1)):", z_u)
print("Z under cosine  (should be large +):", z_c)
KS statistic: 0.03356901014089397
p-value     : 0.3209829769831638

Z under uniform (should be ~N(0,1)): 0.7131579341389477
Z under cosine  (should be large +): 31.419646281119494

11) Pitfalls#

  • Invalid parameters: scale must be strictly positive.

  • Support matters: if any observation lies outside \([\mu-\pi\sigma,\mu+\pi\sigma]\), the likelihood is 0 (log-likelihood \(=-\infty\)).

  • Near-endpoint numerics: when \((x-\mu)/\sigma \approx \pm\pi\), then \(1+\cos(\cdot)\approx 0\) and logpdf will be very negative (as it should). Use log1p(cos(z)) for stability.

  • Angle data: if your variable is truly circular, you may want a wrapped/circular distribution (e.g. von Mises). The cosine distribution is on a line segment, not inherently periodic.
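One caveat on the near-endpoint point above: extremely close to \(\pm\pi\), even log1p(cos(z)) can fail, because the computed \(\cos(z)\) itself rounds to \(-1\) in double precision. The half-angle identity \(1+\cos z = 2\cos^2(z/2)\) sidesteps the cancellation entirely. A small demonstration:

```python
import numpy as np

PI = np.pi

# z is so close to pi that cos(z) = -1 + O(1e-17) rounds to -1.0 in double
# precision, even though the true density there is positive.
z = PI - 1e-8

log1p_form = np.log1p(np.cos(z))  # may evaluate to log(0) = -inf

# Half-angle identity: log(1 + cos z) = log 2 + 2 log|cos(z/2)|.
# cos(z/2) is small but computed accurately, so no cancellation occurs.
half_angle = np.log(2.0) + 2.0 * np.log(np.abs(np.cos(z / 2.0)))

# Taylor reference: 1 + cos(pi - eps) ~ eps^2 / 2 for small eps.
print("log1p form :", log1p_form)
print("half-angle :", half_angle)
print("reference  :", np.log((PI - z) ** 2 / 2.0))
```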

12) Summary#

  • The cosine distribution is a continuous, bounded, symmetric density on \([-\pi,\pi]\) with PDF \(\propto 1+\cos(x)\).

  • It has closed-form CDF, tractable moments, and a neat accept–reject sampler with a 50% acceptance rate.

  • In practice, scipy.stats.cosine handles evaluation, sampling, and fitting via (loc, scale).

References#

  • SciPy: scipy.stats.cosine

  • “Raised cosine” distributions/shapes in signal processing texts (shape relation)